Complex Aggregation at Multiple Granularities

Authors

  • Kenneth A. Ross
  • Divesh Srivastava
  • Damianos Chatziantoniou
Abstract

Datacube queries compute simple aggregates at multiple granularities. In this paper we examine the more general and useful problem of computing a complex subquery involving multiple dependent aggregates at multiple granularities. We call such queries "multi-feature cubes." An example is "Broken down by all combinations of month and customer, find the fraction of the total sales in 1996 of a particular item due to suppliers supplying within 10% of the minimum price (within the group), showing all subtotals across each dimension." We classify multi-feature cubes based on the extent to which fine granularity results can be used to compute coarse granularity results; this classification includes distributive, algebraic and holistic multi-feature cubes. We provide syntactic sufficient conditions to determine when a multi-feature cube is either distributive or algebraic. This distinction is important because, as we show, existing datacube evaluation algorithms can be used to compute multi-feature cubes that are distributive or algebraic, without any increase in I/O complexity. We evaluate the CPU performance of computing multi-feature cubes using the datacube evaluation algorithm of Ross and Srivastava. Using a variety of synthetic, benchmark and real-world data sets, we demonstrate that the CPU cost of evaluating distributive multi-feature cubes is comparable to that of evaluating simple datacubes. We also show that a variety of holistic multi-feature cubes can be evaluated with a manageable overhead compared to the distributive case.
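The example query in the abstract can be illustrated with a small, naive sketch. It is not the Ross and Srivastava algorithm evaluated in the paper: it simply recomputes the dependent aggregates from the base rows at every granularity. The table layout, the sample rows, and the names `SALES`, `group_key`, and `multi_feature_cube` are all hypothetical, and the filters on year and item are assumed to have been applied already.

```python
from itertools import combinations

# Hypothetical base rows: (month, customer, supplier, price, sales).
# Assumed to be pre-filtered to one item and to the year 1996.
SALES = [
    ("Jan", "alice", "s1", 10.0, 100.0),
    ("Jan", "alice", "s2", 10.5, 50.0),
    ("Jan", "bob", "s1", 12.0, 30.0),
    ("Feb", "alice", "s3", 20.0, 40.0),
    ("Feb", "bob", "s2", 8.0, 60.0),
    ("Feb", "bob", "s3", 9.0, 20.0),
]

DIMS = ("month", "customer")
ALL = "ALL"  # placeholder for a dimension that has been rolled up


def group_key(row, kept):
    """Build the grouping key, replacing rolled-up dimensions with ALL."""
    month, customer = row[0], row[1]
    return (
        month if "month" in kept else ALL,
        customer if "customer" in kept else ALL,
    )


def multi_feature_cube(rows, tolerance=0.10):
    """For every subset of DIMS, compute per group the fraction of total
    sales coming from rows priced within `tolerance` of the group minimum."""
    result = {}
    for size in range(len(DIMS) + 1):
        for kept in combinations(DIMS, size):
            groups = {}
            for row in rows:
                groups.setdefault(group_key(row, kept), []).append(row)
            for key, members in groups.items():
                # First aggregate: minimum price within the group.
                min_price = min(m[3] for m in members)
                # Dependent aggregate: sales of rows near that minimum.
                near_min = sum(
                    m[4] for m in members if m[3] <= min_price * (1 + tolerance)
                )
                total = sum(m[4] for m in members)
                result[key] = near_min / total
    return result


CUBE = multi_feature_cube(SALES)
for key in sorted(CUBE):
    print(key, round(CUBE[key], 3))
```

The second aggregate depends on the first (the group minimum), which is what distinguishes this kind of query from a simple datacube. Note that the coarse results here are not derived from the fine ones; whether they can be is exactly the distributive, algebraic, or holistic question the abstract raises.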

Similar resources

Temporal and spatio-temporal aggregations over data streams using multiple time granularities

Temporal and spatio-temporal aggregations are important but costly operations for applications that maintain time-evolving data (data warehouses, temporal databases, etc.). In this paper we examine the problem of computing such aggregates over data streams. The aggregates are maintained using multiple levels of temporal granularities: older data is aggregated using coarser granularities while m...

Temporal Aggregation over Data Streams Using Multiple Granularities

Temporal aggregation is an important but costly operation for applications that maintain time-evolving data (data warehouses, temporal databases, etc.). In this paper we examine the problem of computing temporal aggregates over data streams. Such aggregates are maintained using multiple levels of temporal granularities: older data is aggregated using coarser granularities while more recent data...

Multi-Granular Aspect Aggregation in Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis estimates the sentiment expressed for each particular aspect (e.g., battery, screen) of an entity (e.g., smartphone). Different words or phrases, however, may be used to refer to the same aspect, and similar aspects may need to be aggregated at coarser or finer granularities to fit the available space or satisfy user preferences. We introduce the problem of aspec...

Querying Multiple Temporal Granularity Data

Managing and querying information with varying temporal granularities is an important problem in databases. Although there is a substantial body of work on temporal granularities for the relational data model [11], a comprehensive framework is lacking for the object-oriented paradigm. To the best of our knowledge, a formal treatment of temporal queries with multiple granularities has not been c...

Nanoscale Studies on Aggregation Phenomena in Nanofluids

Understanding the microscopic dispersion and aggregation of nanoparticles at nanoscale media has become an important challenge during the last decades. Nanoscale modeling techniques are the important tools to tackle many of the complex problems faced by engineers and scientists. Making progress in the investigations at nanoscale whether experimentally or computationally has helped understand th...

Publication date: 1998